AITopics | text generation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy

Neural Information Processing Systems, Jun-23-2026, 07:57:41 GMT

As major progress in LLM-based long-form text generation enables paradigms such as retrieval-augmented generation (RAG) and inference-time scaling, safely incorporating private information into the generation remains a critical open question.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Europe > Denmark (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)
Overview (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Breaking the Likelihood Trap: Variance-Calibrated Modulation for Large Language Model Decoding

Ding, Yuanhao, Li, Meimingwei, Arias, Esteban Garces, Aßenmacher, Matthias, Heumann, Christian, Zhang, Chongsheng

arXiv.org Machine Learning, Jun-23-2026

In open-ended generation, LLMs frequently fall into the "likelihood trap", marked by repetitive degeneration and vocabulary dullness, creating a discrepancy between machine-generated and human-written text. While post-hoc tail truncation (e.g., top-p, min-p) avoids sampling from the unreliable tail, it can over-sample from the uncalibrated head and misalign generation with human lexical preferences; fixed scalar repetition penalties likewise ignore variation in logit scale across inference steps, potentially disrupting semantic coherence. To address both limitations, we propose Variance-Calibrated Modulation (VCM), a training-free pre-decoding intervention that reshapes the probability distribution before truncation through two dynamic mechanisms: (1) Contextual Searchlight via PMI, which suppresses global stopwords while elevating context-evoked tokens, and (2) Adaptive Self-Debiasing, which uses real-time logit standard deviation for scale-invariant penalization. Across open-ended generation, factual QA, and mathematical reasoning, VCM consistently mitigates the likelihood trap. With negligible computational overhead, VCM integrates with existing decoding strategies, improving diversity, coherence, and, particularly at higher decoding temperatures, reasoning accuracy.
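The two truncation rules named in the abstract, and the idea of scaling a repetition penalty by the logit standard deviation, can be illustrated with a minimal sketch. This is not the authors' VCM implementation: the function names, the `strength` parameter, and the exact form of the penalty are assumptions made here for illustration, and the PMI-based component is omitted entirely.

```python
import math


def renormalize(probs, kept):
    """Zero out every token not in `kept` and rescale the rest to sum to 1."""
    keep = set(kept)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_p_filter(probs, p=0.9):
    """Top-p: keep the smallest set of most likely tokens whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return renormalize(probs, kept)


def min_p_filter(probs, min_p=0.1):
    """Min-p: keep tokens with probability >= min_p times the top probability."""
    threshold = min_p * max(probs)
    kept = [i for i, q in enumerate(probs) if q >= threshold]
    return renormalize(probs, kept)


def std_scaled_penalty(logits, seen_ids, strength=1.0):
    """Hypothetical penalty: lower the logits of already generated tokens by
    strength * std(logits), so the penalty tracks the current logit scale
    instead of being a fixed scalar."""
    mean = sum(logits) / len(logits)
    std = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    return [x - strength * std if i in seen_ids else x
            for i, x in enumerate(logits)]
```

Under these assumptions, multiplying all logits by a constant multiplies the penalty by the same constant, which is one simple reading of "scale-invariant"; a fixed scalar penalty would not have that property.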

computational linguistic, large language model, natural language, (18 more...)

arXiv.org Machine Learning

2606.22511

Country:

Asia > Middle East > UAE (0.46)
North America > United States (0.46)
Europe > Austria (0.28)
Europe > Germany (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment > Sports (1.00)
Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Don't Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation

Neural Information Processing Systems, Jun-22-2026, 22:43:13 GMT

While diffusion language models (DLMs) enable fine-grained refinement, their practical controllability remains fragile. We identify and formally characterize a central failure mode--update-forgetting--in which uniform, context-agnostic updates induce token-level fluctuations across timesteps, erasing earlier semantic edits and disrupting the cumulative refinement process, thereby degrading fluency and coherence. As this failure originates in uniform, context-agnostic updates, effective control demands explicit token ordering. We propose Token Timestep Allocation (TTA-DIFFUSION), which realizes soft, semantic token ordering via per-token timestep schedules: critical tokens are frozen early, while uncertain tokens receive continued refinement. This timestep-based ordering can be instantiated as either a fixed policy or an adaptive policy driven by task signals, thereby supporting a broad spectrum of refinement strategies. Because it operates purely at inference time, it applies uniformly across various DLMs and naturally extends to diverse supervision sources. Empirically, TTA-DIFFUSION improves controllability and fluency: on sentiment control, it yields >20% higher accuracy and nearly halves perplexity using <1/5 the steps; in detoxification, it lowers maximum toxicity (12.2 vs. 14.5) and perplexity (26.0 vs. 32.0). Together, these results demonstrate that softened ordering via timestep allocation is the critical lever for mitigating update-forgetting and achieving stable and controllable diffusion text generation.
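The idea of a per-token timestep schedule can be pictured with a small sketch. It is not the TTA-DIFFUSION algorithm: the linear mapping from a confidence score to a step budget, the `min_steps` parameter, and both function names are hypothetical choices made here only to show how "frozen early" and "continued refinement" could be expressed as a mask over tokens.

```python
def allocate_refinement_steps(confidence, total_steps, min_steps=1):
    """Map each token's confidence in [0, 1] to a refinement step budget.

    Assumption for illustration: confident tokens get few steps (frozen
    early), uncertain tokens get up to `total_steps`.
    """
    budgets = []
    for c in confidence:
        c = min(max(c, 0.0), 1.0)  # clamp to [0, 1]
        budgets.append(round(min_steps + (1.0 - c) * (total_steps - min_steps)))
    return budgets


def update_mask(budgets, step):
    """Return True for tokens still being refined at 0-indexed `step`."""
    return [step < b for b in budgets]
```

With a uniform schedule every entry of the mask stays True until the final step; here the mask shrinks over time, so tokens that were settled early are no longer overwritten by later updates.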

artificial intelligence, natural language, text classification, (17 more...)

Neural Information Processing Systems

Country:

Asia (0.46)
Europe (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government (0.46)
Education (0.46)
Banking & Finance (0.46)
Leisure & Entertainment > Sports (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.45)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.45)

COSMOS: Compressed and Smooth Latent Space for Text Diffusion Modeling

Neural Information Processing Systems, Jun-15-2026, 01:18:18 GMT

Autoregressive language models dominate modern text generation, yet their sequential nature introduces fundamental limitations: decoding is slow, and maintaining global coherence remains challenging. Diffusion models offer a promising alternative by enabling parallel generation and flexible control; however, their application to text generation is hindered by the high dimensionality of token-level representations. We introduce COSMOS, a novel approach to text generation that operates entirely in a compressed, smooth latent space tailored specifically for diffusion. This space is learned using an autoencoder trained simultaneously for token-level reconstruction and alignment with frozen activations from a pretrained language encoder, providing robust semantic grounding and enabling effective perturbation-based augmentations. Empirically, we demonstrate that text representations can be compressed up to 8× while maintaining generation quality comparable to token-level diffusion models. Furthermore, increasing the latent sequence length allows COSMOS to surpass both diffusion-based and autoregressive baselines. We evaluate COSMOS on four diverse generative tasks including story generation, question generation, summarization, and detoxification and compare it with various generative paradigms. COSMOS achieves comparable or superior generation quality while offering more than 2× faster inference. Code is released at GitHub.

diffusion model, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (0.92)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government (1.00)
Leisure & Entertainment (0.68)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Constrained Discrete Diffusion

Neural Information Processing Systems, Jun-14-2026, 21:54:53 GMT

Discrete diffusion models are a class of generative models that construct sequences by progressively denoising samples from a categorical noise distribution. Beyond their rapidly growing ability to generate coherent natural language, these models present a new and important opportunity to enforce sequence-level constraints, a capability that current autoregressive models cannot natively provide. This paper capitalizes on this opportunity by introducing Constrained Discrete Diffusion (CDD), a novel integration of differentiable constraint optimization within the diffusion process to ensure adherence to constraints, logic rules, or safety requirements for generated sequences. Unlike conventional text generators that often rely on post-hoc filtering or model retraining for controllable generation, CDD directly imposes constraints into the discrete diffusion sampling process, resulting in a training-free and effective approach. Experiments in toxicity-controlled text generation, property-constrained molecule design, and instruction-constrained text completion demonstrate that CDD achieves zero constraint violations in a diverse array of tasks while preserving fluency, novelty, and coherence, and that it outperforms autoregressive and existing discrete diffusion approaches.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.67)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency

Neural Information Processing Systems, Jun-12-2026, 09:18:46 GMT

We study Transformers through the perspective of optimal control theory, using tools from continuous-time formulations to derive actionable insights into training and architecture design. This framework improves the performance of existing Transformer models while providing desirable theoretical guarantees, including generalization and robustness. Our framework is designed to be plug-and-play, enabling seamless integration with established Transformer models and requiring only slight changes to the implementation. We conduct seven extensive experiments on tasks motivated by text generation, sentiment analysis, image classification, and point cloud classification. Experimental results show that the framework improves the test performance of the baselines, while being more parameter-efficient. On character-level text generation with nanoGPT, our framework achieves a 46% reduction in final test loss while using 42% fewer parameters. On GPT-2, our framework achieves a 9.3% reduction in final test loss, demonstrating scalability to larger models. To the best of our knowledge, this is the first work that applies optimal control theory to both the training and architecture of Transformers. It offers a new foundation for systematic, theory-driven improvements and moves beyond costly trial-and-error approaches.

large language model, machine learning, natural language, (11 more...)

Neural Information Processing Systems

Technology: